COMPOSITIONS AND METHODS FOR PROTEIN
PURIFICATION BASED ON A METAL ION AFFINITY SITE

BACKGROUND OF THE INVENTION

Cross Reference to Related Applications

This application claims benefit of non-provisional
application US Serial number 09/078,687 filed May 14, 1998.

Field of the Invention

This invention relates generally to the field of protein
chemistry. Specifically, the present invention relates to protein
purification using a metal ion affinity site.

Background of the Invention

Development of protocols for the isolation and
purification of proteins is often a long and costly process. Such
protocols usually contain multiple steps, where some of the steps
have recoveries as low as 50%. Further, due to variation between
protein molecules, a purification protocol developed and effective
for the purification of one protein is not necessarily useful for the
purification of another. In fact, in most cases considerable
adaptations must be made to a purification protocol to
accommodate the various physical and chemical characteristics of
different proteins.

The ability to prepare hybrid genes by genetic 
engineering technology has opened up new possibilities for the 
purification of proteins. For example, one can link a DNA 
sequence of a protein of interest to a nucleic acid sequence which 
codes for a peptide which has a high binding affinity for a specific 
ligand. The fusion protein product resulting from expression of 
this DNA has attributes of both the protein of interest and the 
high affinity peptide. To purify or immobilize the engineered 
fusion protein, the ligand commonly is linked to a support, and the 
unpurified, engineered protein is then exposed to the 
ligand/support composite and allowed to bind. 

There are numerous advantages of using a high 
affinity fusion protein. For example, the use of an affinity peptide 
ensures that no part of the native protein of interest is involved in 
adsorption, that is, the binding between the fusion protein and the
ligand. At the same time, extremely high selectivity in the
adsorption process is achieved.

Immobilized Metal Ion Affinity Chromatography 
(IMAC) is one of the most frequently used techniques for 
purification of fusion proteins containing affinity sites for metal 
ions. Proper choice of immobilized metal ion, loading conditions 
and elution conditions can result in protein purification of up to 
about 95-98% in a single chromatographic step. Moreover, 
recovery generally is higher than 85%. In addition to the
advantages discussed above, incorporation of a proteolytic,
chemical, or enzymatic cleavage site into the composite DNA,
between the affinity peptide and the sequence of the protein of
interest, provides a means for cleaving the affinity peptide from
the protein of interest to yield the native protein of interest in
highly purified form.

The following publications are representative of the
art: Itakura, et al., Science 198:1056-63 (1977); Germino, et al.,
PNAS USA 80:6848-52 (1983); Nilsson et al., Nucleic Acids Res.
13:1151-62 (1985); Smith et al., Gene 32:321-27 (1984); Dobeli, et
al., U.S. Pat. No. 5,284,933; and Dobeli, et al., U.S. Pat. No.
5,310,663.

The prior art is deficient in improved compositions
and methods for affinity immobilization and purification of
proteins. This invention fulfills this long-felt need in the art.

The present invention relates to compositions and
methods for protein purification involving the use of novel,
genetically-engineered fusion proteins. These fusion proteins are
engineered to allow for immobilization and purification via the
high affinity interaction of an affinity peptide of a fusion protein
with a ligand. The affinity peptide is a histidine-rich polypeptide
sequence with a general sequence (HXn)m, wherein H is histidine,
X is an amino acid other than histidine, n=1-8, m=2-30, and
wherein if n=1 for more than two adjacent units of HX, at least one
X must be asparagine, phenylalanine, tryptophan, tyrosine, lysine,
methionine, arginine, glutamine, or cysteine. The affinity peptide
is linked to the proteins of interest R1 and R2 to yield a fusion
protein with formula R1-(HXn)m-R2. In a preferred embodiment of
the invention, n=1-4 and m=2-10. In a more preferred
embodiment of the invention, n=1-4 and m=3-6. In a specific
embodiment of the invention, a fusion protein having the
sequence SLKDHLIHNVHKEEHAHAHNKISVVGVGAVGM (SEQ ID
No:6) is provided. In another embodiment of the present aspect
of the invention, at least one protease cleavage site is inserted
between the sequence of the protein of interest and the sequence
of the affinity peptide.
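
The sequence rule above can be checked mechanically. The following
is a minimal sketch in Python, not part of the original disclosure.
It assumes that n may differ from unit to unit (as it does in SEQ ID
No:6) and that the rule on adjacent n=1 units is applied to each run
of more than two such units. The function names are hypothetical.

```python
import re

# Residues that satisfy the rule for runs of n=1 units, as listed in the
# text: asparagine, phenylalanine, tryptophan, tyrosine, lysine,
# methionine, arginine, glutamine, cysteine.
RESCUE = set("NFWYKMRQC")
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def split_units(peptide):
    """Split a peptide into HXn units, or return None if it is not (HXn)m."""
    if not set(peptide) <= AMINO_ACIDS:
        return None
    # Each unit is one H followed by 1-8 non-H residues; 2-30 units overall.
    if not re.fullmatch(r"(?:H[^H]{1,8}){2,30}", peptide):
        return None
    return re.findall(r"H[^H]{1,8}", peptide)


def is_affinity_peptide(peptide):
    """Apply the (HXn)m rule, including the condition on adjacent n=1 units."""
    units = split_units(peptide)
    if units is None:
        return False
    run = []  # X residues of the current run of adjacent n=1 units
    for unit in units + [""]:  # the empty sentinel flushes the last run
        if len(unit) == 2:
            run.append(unit[1])
            continue
        if len(run) > 2 and not RESCUE & set(run):
            return False
        run = []
    return True


# Histidine-rich stretch taken from SEQ ID No:6 (illustrative choice).
print(split_units("HLIHNVHKEEHAHAHNK"))
print(is_affinity_peptide("HLIHNVHKEEHAHAHNK"))
print(is_affinity_peptide("HAHAHAHA"))
```

Under these assumptions the stretch from SEQ ID No:6 splits into six
units, while a plain alternating His-Ala repeat of four units is
rejected because none of its X residues is in the listed set.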

In another aspect of the invention, there is provided a
DNA sequence coding for a fusion protein comprising a protein of
interest fused at its amino-terminus or carboxy-terminus to at
least one affinity peptide, where the fusion protein has the
general formula R1-(HXn)m-R2, wherein R1 or R2 is the protein of
interest, H is histidine, X is an amino acid other than histidine,
n=1-8, m=2-30, and wherein if n=1 for more than two adjacent
units of HX, at least one X must be asparagine, phenylalanine,
tryptophan, tyrosine, lysine, methionine, arginine, glutamine, or
cysteine. In a specific embodiment of this aspect of the invention,
there is provided a DNA sequence which codes for a protein where
the fusion protein has the sequence
SLKDHLIHNVHKEEHAHAHNKISVVGVGAVGM (SEQ ID No:6).

In various embodiments of this aspect of the
invention, there is provided a recombinant vector comprising an
expression vector and a DNA sequence coding for a fusion protein
comprising a protein of interest fused at its amino-terminus or
carboxy-terminus to at least one affinity peptide as described
above, wherein the recombinant vector is capable of directing
expression of the DNA sequence in a suitable host organism. The
present invention also provides a host organism containing a
recombinant expression vector comprising a DNA sequence coding
for a fusion protein comprising a protein of interest fused at its
amino-terminus or carboxy-terminus to at least one affinity
peptide as described above, wherein the organism is capable of
expressing said DNA sequence.

In yet an additional aspect of this invention, there is
provided a method for purifying the novel fusion proteins of the
present invention, comprising the steps of: contacting a protein
sample containing the fusion protein in a mixture with other
proteins with a metal chelate resin under conditions where the
fusion protein binds to the resin to produce a resin-fusion protein
complex; washing the resin-fusion protein complex with a buffer
to remove the other, unbound proteins; and eluting the bound
fusion protein from the washed resin-fusion protein complex. One
embodiment of this method includes inserting at least one
protease cleavage site between the protein of interest and the
affinity peptide, and cleaving the protein of interest from the
affinity peptide after purification using the metal chelate resin.

Other and further aspects, embodiments, features and 
advantages of the present invention will be apparent from the 
following description of the invention.

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 is a schematic representation of the
pGFPuv/HAT vector. This vector contains one embodiment of the
affinity peptide of the present invention fused to the N-terminus
of the UV mutant of Green Fluorescent Protein. Restriction
sites are noted.

Figure 2 is a schematic representation of
pUC19/HS, containing part of the affinity peptide at the
N-terminus of an enterokinase (EK) cleavage site followed by a
multiple cloning site (MCS).

Figure 3 is a schematic representation of the
restriction maps of a vector with three frame shifts containing
part of the affinity peptide and an enterokinase cleavage site that
is used for expression of recombinant proteins.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention relates to compositions and
methods for purification of novel, genetically-engineered fusion
proteins. Immobilization and purification is achieved via a high
affinity interaction of an affinity peptide portion of the fusion
protein with a ligand. The affinity peptide has a general sequence
(HXn)m and is linked to proteins of interest R1 or R2, wherein H
is histidine, X is an amino acid other than histidine, n=1-8,
m=2-30, and wherein if n=1 for more than two adjacent units of
HX, at least one X must be asparagine, phenylalanine, tryptophan,
tyrosine, lysine, methionine, arginine, glutamine, or cysteine. The
affinity peptide allows for purification of the fusion protein.

The affinity of the high affinity peptide is to
immobilized metal ions. The strength of binding between the high
affinity peptide and an appropriate metal ion is very high; thus,
isolation of the fusion proteins is very selective. However,
association between the peptide and the ligand is also reversible.
After association with the metal ion ligand, the protein can be
dissociated or eluted from the metal ion/adsorbent by addition of a
competitive ligand such as imidazole, or by decreasing the pH,
which leads to protonation of the nitrogen in the imidazole ring of
the histidine side chain and release of the adsorbed protein.
Because of this reversibility, the protein is recovered in a
purified, unbound form. Further, regeneration and reuse of the
metal ion/adsorbent or support multiple times is possible.
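
The effect of lowering the pH can be illustrated with the
Henderson-Hasselbalch relation. The sketch below is an illustration
only. It assumes a side-chain pKa of about 6.0 for histidine, a
textbook value for the free amino acid that is not given in this
specification and that shifts with the local protein environment.

```python
def protonated_fraction(ph, pka=6.0):
    """Fraction of imidazole side chains that are protonated at a given pH.

    Henderson-Hasselbalch: [HA+]/([HA+] + [A]) = 1 / (1 + 10**(pH - pKa)).
    The default pKa of 6.0 is an assumed, approximate value.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Protonated imidazole no longer coordinates the metal ion, so the bound
# protein is released as this fraction approaches 1.
for ph in (7.0, 6.0, 5.0, 4.0):
    print(f"pH {ph:.1f}: {protonated_fraction(ph):.3f} protonated")
```

With this assumed pKa, about one tenth of the histidine side chains
are protonated at the loading pH of 7.0, and nearly all of them at
pH 4.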
An additional feature of the protein purification and
immobilization technique based on the principles of the present
invention is the high probability that the purified and regenerated
protein of interest will retain full biological activity after
purification. This is because the affinity peptide is involved in
the immobilization/binding process whereas the portion of the
fusion protein that contains the protein of interest is not.

Incorporation of a proteolytic site between the high
affinity peptide and the sequence of the protein of interest
provides the means to regenerate the protein of interest from the
fusion protein. Regeneration is achieved by limited proteolysis of
the fusion protein and a second chromatography step in which the
proteolytic product is passed through an immobilized metal ion
column. Indeed, one can utilize the same column as was
used to immobilize and purify the fusion protein. In the second
chromatography step following proteolysis, the cleaved
protein of interest passes through the column without
immobilization or adsorption, whereas the high affinity peptide is
adsorbed on the column.
One embodiment of the present invention features
introduction of nucleic acid sequences which code for secretion
signals into the DNA sequence that codes for the fusion protein.
Such secretion signals cause the fusion protein to be secreted into
the media after synthesis in a host cell. Since a large part of
total cellular protein remains in the cell, secretion
improves dramatically the isolation and purification of the fusion
protein by eliminating the need for cell disruption, protein
extraction and/or removal of unwanted cellular components and
nucleic acid.
As used herein, the terms "affinity peptide" or "high
affinity peptide" refer to a histidine-rich polypeptide with a
general sequence (HXn)m which is linked to proteins of interest R1
or R2; wherein H is histidine, X is an amino acid other than
histidine, n=1-8, m=2-30, and wherein if n=1 for more than two
adjacent units of HX, at least one X must be asparagine,
phenylalanine, tryptophan, tyrosine, lysine, methionine, arginine,
glutamine, or cysteine.

As used herein, the term "protein of interest" shall
refer to any protein to which the affinity peptide is fused for the
purpose of purification or immobilization.

As used herein, the term "fusion protein" shall refer to
a protein hybrid containing the affinity peptide and the protein
of interest or any amino acid sequence of interest. The fusion
protein has the general formula R1-(HXn)m-R2, wherein R1 or R2
is the protein of interest, H is histidine, X is an amino acid other
than histidine, n=1-8, m=2-30, and wherein if n=1 for more than
two adjacent units of HX, at least one X must be asparagine,
phenylalanine, tryptophan, tyrosine, lysine, methionine, arginine,
glutamine, or cysteine.

As used herein, the terms "secretion sequence" or
"secretion signal sequence" shall refer to an amino acid signal
sequence which leads to the transport of a protein containing the
signal sequence outside the cell membrane. In the present case, a
fusion protein of the present invention may contain such a
secretion sequence to enhance and simplify purification.
Representative examples of secretion signal sequences are well
known to those having ordinary skill in this art.

As used herein, the term "proteolytic site" shall refer
to any amino acid sequence recognized by any proteolytic enzyme.
In the present case, a fusion protein of the present invention may
contain such a protease site between the protein of interest and
the affinity peptide and/or other amino acid sequences, so that the
protein of interest may be separated easily from these
heterologous amino acid sequences.

As used herein, the term "metal ion" refers to any
metal ion for which the affinity peptide has affinity and that can
be used for purification or immobilization of a fusion protein.

As used herein, the term "support" shall refer to a
chromatography or immobilization medium used to immobilize a
metal ion.

As used herein, the term "regeneration", in the context
of the fusion protein, shall refer to the process of separating or
eliminating the affinity peptide and other heterologous amino acid
sequences from the fusion protein to render the protein of
interest after purification.

So that the manner in which the above-recited features,
advantages, and objects of the invention become clear and can be
understood in detail, particular descriptions of the invention may
be had by reference to particular embodiments described in the
Examples below; however, the following description and examples
are given for the purpose of illustrating various, specific 
embodiments of the invention, and are not meant to limit the 
scope of the invention in any fashion. 



EXAMPLE 1

Extraction of lactate dehydrogenase from chicken breast muscle

A naturally-occurring peptide sequence from the
N-terminus of lactate dehydrogenase (LDH) from chicken muscle
(Gallus gallus) was used for initial experiments. The protein
includes a stretch of approximately 30 amino acids which has a
sequence consistent with the general formula of the fusion protein
of the present invention
(SLKDHLIHNVHKEEHAHAHNKISVVGVGAVGM (SEQ ID No:6)).
Further, LDH has the feature that the enzyme itself can be assayed
easily for activity. Thus, the naturally-occurring chicken muscle
LDH served as a "fusion protein" for these experiments in the
sense that it contained both a high affinity peptide and a protein
of interest.

Extraction of chicken breast muscle LDH was
performed by cutting 15 g of frozen chicken muscle, free of blood
vessels, into small pieces and transferring the material to a
commercial blender along with 150 mL of extraction buffer (50
mM sodium phosphate, 1 mM EDTA, 1 mM magnesium acetate pH
7.5, 1 mM 2-mercaptoethanol (0.2 L), stored for at least 30
minutes at 4°C). The mixture was homogenized twice at 4°C for 30
seconds, with a 10-minute pause between the bursts. After the
second homogenization, the mixture was transferred to centrifuge
tubes and centrifuged at 4°C and 10,000 x g for 30 minutes. The
clear supernatant was collected and used as a starting sample for
the purification of lactate dehydrogenase.



EXAMPLE 2

Purification of lactate dehydrogenase on Ni(II)-Chelating
Sepharose FF

Lactate dehydrogenase was purified by IMAC in the
following manner: approximately 5 mL of Chelating Sepharose FF
(Amersham, Pharmacia) was transferred to a vacuum bottle,
diluted with an equal volume of deionized water and degassed
under vacuum for 10 minutes. The gel suspension was poured
into a column (10 x 1 cm i.d.) trapped on the bottom with a
degassed adapter and left to settle. The column was then filled to
the top with degassed deionized water, and a top adapter was
gently pushed down the column bed until there was no space
between the top surface of the gel and the adapter. The column
was washed with 3 column volumes of deionized water at a flow
rate of 0.5 mL per min.
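
The column figures above imply a bed height and a wash duration
that are easy to work out. The following sketch is illustrative
arithmetic only; it assumes that the settled bed volume equals the
5 mL of gel transferred and that one column volume equals the bed
volume. The function names are hypothetical.

```python
import math


def bed_height_cm(bed_volume_ml, inner_diameter_cm):
    """Height of a cylindrical gel bed (1 mL = 1 cm^3)."""
    area_cm2 = math.pi * (inner_diameter_cm / 2.0) ** 2
    return bed_volume_ml / area_cm2


def wash_time_min(column_volumes, bed_volume_ml, flow_ml_per_min):
    """Time needed to pass a number of column volumes at a given flow rate."""
    return column_volumes * bed_volume_ml / flow_ml_per_min


# 5 mL of gel in a 1 cm i.d. column, washed with 3 column volumes
# at 0.5 mL per min.
print(f"bed height: {bed_height_cm(5.0, 1.0):.2f} cm")
print(f"wash time:  {wash_time_min(3, 5.0, 0.5):.0f} min")
```

Under these assumptions the bed is a little over 6 cm high, which
fits within the 10 cm column, and the water wash takes about half an
hour.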

The chicken muscle extract (14 mL) was equilibrated
by gel filtration on Sephadex G-25 columns with equilibration
buffer (20 mM sodium phosphate buffer containing 1.0 M sodium
chloride and 0.06 M imidazole pH 7.0 (1 L)). The IMAC column
was then charged with Ni(II) ions using 20 mL of a 0.02 M
Ni(NO3)2 solution. The excess metal was washed from the column
with deionized water at a flow rate of 0.5 mL per minute and the
column was then equilibrated with 5-10 volumes of equilibration
buffer (20 mM sodium phosphate buffer containing 1.0 M sodium
chloride and 0.06 M imidazole pH 7.0 (1 L)).
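
As a worked example of preparing such a buffer, the masses of
sodium chloride and imidazole follow from mass = molarity x volume x
molar mass. The sketch below is not part of the original protocol.
The molar masses (58.44 g/mol for NaCl, 68.08 g/mol for imidazole)
are standard values supplied here as assumptions, and the phosphate
component is left out because the specification does not state which
sodium phosphate salt was used.

```python
# Standard molar masses in g/mol (assumed values, not from the text).
MOLAR_MASS = {"NaCl": 58.44, "imidazole": 68.08}


def grams_needed(molarity, volume_l, molar_mass):
    """Mass of solute for a solution of given molarity (mol/L) and volume (L)."""
    return molarity * volume_l * molar_mass


# Equilibration buffer: 1.0 M NaCl and 0.06 M imidazole, 1 L prepared.
nacl_g = grams_needed(1.0, 1.0, MOLAR_MASS["NaCl"])
imidazole_g = grams_needed(0.06, 1.0, MOLAR_MASS["imidazole"])
print(f"NaCl: {nacl_g:.2f} g, imidazole: {imidazole_g:.2f} g per 1 L")
```

The same function applies to any other buffer volume or imidazole
concentration by changing the arguments.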

IMAC was carried out by loading the
equilibrated extract onto the IMAC column at a flow rate of 0.5
mL per min. Fractions of 1 mL were collected. The column was
washed with equilibration buffer until a baseline was reached
(absorbance of the fractions at 280 nm was less than 2 mAU
higher than the absorbance of the equilibration buffer). The
adsorbed material was eluted with elution buffer (20 mM sodium
phosphate buffer containing 1.0 M sodium chloride and 0.3 M
imidazole pH 7.0 (0.2 L)) and absorbance at 280 nm was
determined on a spectrophotometer. Protein content of each
fraction was determined as described in M. Bradford, Analytical
Biochemistry, 72 (1976) 248, and lactate dehydrogenase activity
was determined as described in F. Kubowitz and P. Ott, Biochem. Z.,
314 (1943) 94. Results indicated that more than 95% of the
lactate dehydrogenase activity was recovered in the elution
fractions.

EXAMPLE 3

Characterization of lactate dehydrogenase binding

Further experiments were performed both with native
LDH, a tetramer of about 140 kD, and a subunit of the enzyme,
obtained after warming of the crude chicken muscle extract to
45°C for 10 minutes. Both the tetramer and the subunit were
allowed to associate with the immobilized Ni support, and both
forms of LDH were retained. This result demonstrates that the
retention of the LDH enzyme on immobilized Ni is not peculiar to
the tetrameric form of the protein; that is, binding does not
require "cooperation" between subunits. Instead, the single
subunit of the enzyme also had affinity for the nickel ion, and this
affinity was demonstrated to be virtually identical to the affinity
shown for the tetramer. Both the native protein and the subunit
were adsorbed in buffer with an imidazole concentration up to 60
mM and both were eluted completely at a concentration of 300
mM imidazole.

To ascertain that it is the polyhistidine portion of the
LDH that provides affinity for the nickel ion, the tetrameric form of
the LDH enzyme was subjected to CNBr cleavage to produce a
mixture of peptides. This mixture of peptides was applied to a
Ni-IDA column with metal ion capacity of 32 µmol per mL gel.

Loading conditions were the same as those used for the
purification of the enzyme from the crude extract, described
above. The adsorbed material was eluted with 300 mM
imidazole, and subjected to reversed-phase chromatography (RPC).
The chromatographic peak containing about 80% of the adsorbed
material was then subjected to amino acid analysis. The results
obtained demonstrate that this peak corresponds to the
N-terminal peptide from LDH and that this peptide contains the
polyhistidine sequence. In addition, the fact that the peptide
retained its binding affinity even after treatment with CNBr in
presence of 70% TFA is proof that the binding is not due to a rigid
secondary structure.
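
The outcome of the CNBr digestion can be sketched in silico. CNBr
cleaves on the C-terminal side of methionine, and SEQ ID No:6 ends
in methionine, which is consistent with the histidine-rich
N-terminal stretch being released as a single fragment. The code
below is a simplified illustration, not the authors' analysis: it
ignores the conversion of methionine to homoserine lactone, and the
residues appended after SEQ ID No:6 are a hypothetical placeholder,
not the real LDH sequence.

```python
def cnbr_fragments(sequence):
    """Split a protein sequence after every methionine (M)."""
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        if residue == "M":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder
    return fragments


N_TERMINAL = "SLKDHLIHNVHKEEHAHAHNKISVVGVGAVGM"  # SEQ ID No:6
PLACEHOLDER_TAIL = "AAAKMGGS"  # hypothetical downstream residues

fragments = cnbr_fragments(N_TERMINAL + PLACEHOLDER_TAIL)
for fragment in fragments:
    print(fragment, "histidines:", fragment.count("H"))
```

In this sketch all six histidines end up in the first fragment,
which is the only one expected to be retained on the Ni-IDA column.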



EXAMPLE 4

Purification of lactate dehydrogenase on Co2+-TALON agarose

Extraction of chicken breast muscle LDH was
performed as in Example 1, and equilibrated by gel filtration on
Sephadex G-25 columns with equilibration buffer (20 mM sodium
phosphate buffer containing 1.0 M sodium chloride and 0.06 M
imidazole pH 7.0 (1 L)).

The IMAC column was prepared in the following
manner: Approximately 2.75 mL of Co2+-TALON Superflow 6
(Amersham, Pharmacia) was transferred to a vacuum bottle,
diluted with the same volume of deionized water and degassed
under vacuum for 10 minutes. The gel suspension was poured
into a column (3 x 1 cm i.d.) trapped on the bottom with a
degassed adapter and left to settle. The column was filled to the
top with degassed deionized water, and a top adapter was gently
pushed down toward the column bed until there was no space
between the top surface of the gel and the adapter. The column
was washed with 3 column volumes of deionized water at a flow
rate of 0.5 mL per min.

Purification of the fusion protein on Co2+-TALON
Superflow 6 was carried out by first equilibrating the IMAC
column with 5 to 10 column volumes of the equilibration buffer.
The sample was then loaded on the IMAC column at a flow rate of
1.0 mL per min, and 1 mL fractions were collected. The column
was washed with the equilibration buffer until a baseline was
reached (absorbance of the fractions at 280 nm was less than 2
mAU higher than the absorbance of the equilibration buffer).
The adsorbed material was eluted with elution buffer
(20 mM sodium phosphate buffer containing 1.0 M sodium
chloride and 0.3 M imidazole pH 7.0 (0.2 L)) and absorbance at
280 nm was determined on a spectrophotometer. Protein content
of each fraction was determined as described in M. Bradford,
Analytical Biochemistry, 72 (1976) 248, and lactate
dehydrogenase activity was determined as described in F.
Kubowitz and P. Ott, Biochem. Z., 314 (1943) 94. As in Example 2,
more than 95% of the lactate dehydrogenase activity was
recovered in the elution fractions.
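
Recovery figures of this kind are the ratio of the activity found
in the elution fractions to the activity loaded. The sketch below
shows the arithmetic only; the activity values are hypothetical and
were not taken from the experiments described here.

```python
def percent_recovery(loaded_units, eluted_fraction_units):
    """Percentage of loaded enzyme activity found in the elution fractions."""
    return 100.0 * sum(eluted_fraction_units) / loaded_units


# Hypothetical activities in enzyme units (U).
loaded = 1000.0
eluted = [120.0, 540.0, 260.0, 45.0]
print(f"recovery: {percent_recovery(loaded, eluted):.1f}%")
```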


EXAMPLE 5 

Isolation and purification of fusion protein consisting of affinity
peptide and Green Fluorescent Protein UV Mutant (GFPuv)

An affinity peptide/GFP fusion protein was isolated
from E. coli cells which had been transformed with the pGFPuv.HS
vector. Cell paste (0.39 g) was transferred to a pre-cooled mortar,
1.2 g of alumina was added, and the mixture was ground for 2
minutes. Extraction buffer (5 mL, stored at 4°C) was added, and,
after additional grinding for 2 minutes, the suspension was
transferred into four Eppendorf tubes and centrifuged for 12
minutes at 12,000 rpm (11,750 x g). The clear supernatant
(approximately 6 mL) was used as a starting sample for IMAC.
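
The relation between the rotor speed and the relative centrifugal
force quoted above depends on the rotor radius, which the text does
not state. The following sketch uses the standard definition
RCF = (angular velocity)^2 x radius / g to show the conversion and
to back-calculate the radius implied by the quoted pair of values.
The function names are hypothetical.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2


def rcf(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) at a given rotor radius."""
    omega = 2.0 * math.pi * rpm / 60.0  # angular velocity in rad/s
    return omega ** 2 * (radius_cm / 100.0) / STANDARD_GRAVITY


def radius_for_rcf(rpm, target_rcf):
    """Rotor radius in cm that gives `target_rcf` at the given speed."""
    omega = 2.0 * math.pi * rpm / 60.0
    return 100.0 * target_rcf * STANDARD_GRAVITY / omega ** 2


# Radius implied by 12,000 rpm and 11,750 x g.
radius = radius_for_rcf(12000, 11750)
print(f"implied rotor radius: {radius:.1f} cm")
```

The implied radius of roughly 7 cm is typical of a benchtop
microcentrifuge rotor, so the two quoted figures are mutually
consistent.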

The extraction and chromatography equilibration
buffers consisted of 20 mM sodium phosphate buffer containing
1.0 M sodium chloride and 5 mM imidazole pH 7.0 (1 L). The
elution buffer for IMAC consisted of 20 mM sodium phosphate
buffer containing 1.0 M sodium chloride and 150 mM imidazole
pH 7.0 (0.2 L).

The IMAC was carried out in the following manner:
Approximately 2.75 mL of Co2+-TALON Superflow 6 (Amersham,
Pharmacia) was transferred to a vacuum bottle, diluted with the
same volume of deionized water and degassed under vacuum for
10 minutes. The gel suspension was poured into a column (3 x 1
cm i.d.) trapped on the bottom with a degassed adapter and left
to settle. The column was filled to the top with degassed
deionized water, and a top adapter was gently pushed down
toward the column bed until there was no space between the top
surface of the gel and the adapter. The column was washed with
3 column volumes of deionized water at a flow rate of 0.5 mL per
min.

Purification of the fusion protein on Co2+-TALON
Superflow 6 was carried out by first equilibrating the IMAC
column with 5 to 10 column volumes of the equilibration buffer.
The sample was then loaded on the IMAC column at a flow rate of
1.0 mL per min, and 1 mL fractions were collected. The column
was washed with the equilibration buffer until a baseline was
reached (absorbance of the fractions at 280 nm was less than 2
mAU higher than the absorbance of the equilibration buffer). The
adsorbed material was then eluted with elution buffer.

Absorbance of each fraction at 280 nm was
determined on a spectrophotometer, and protein content of each
fraction was determined. Fluorescence of each fraction was
determined on a microplate reader, and the purity of the fusion
protein was determined also by SDS-electrophoresis. More than
85% of the fusion protein was recovered in the fractions obtained.

Part of the cDNA sequence, and the amino acid sequence encoded
by this cDNA sequence, of a vector containing the affinity peptide
at the N-terminus of Green Fluorescent Protein-UV mutant
(GFPuv) are shown in SEQ ID No. 1 and SEQ ID No. 2, respectively.
The full cDNA sequence of a vector containing the construct of the
affinity peptide at the N-terminus of GFPuv is shown in SEQ ID No.
3. The full cDNA sequence of a vector containing part of the
affinity peptide at the N-terminus of the enterokinase cleavage
site and the amino acid sequence encoded by this cDNA
corresponding to the start of translation site, the affinity peptide
and the multiple cloning site are shown in SEQ ID Nos. 4 and 5.

EXAMPLE 6 

Construction of fusion proteins

A DNA sequence corresponding to the affinity peptide
of the present invention is fused to the DNA coding sequence of a
protein of interest. The polynucleotide sequence for the affinity
peptide is fused most generally at or close to the DNA sequence
coding for the N- or C-terminal amino acid of the protein of
interest. This results in a DNA sequence which codes for a fusion
protein comprising the affinity peptide and the protein of interest.

In addition, a polynucleotide sequence that codes for a
protein proteolytic site is incorporated into the fusion protein DNA
sequence between the sequence for the affinity peptide and the
sequence of the protein of interest. This type of DNA construct
results in a fusion protein product having a proteolytic site. This
site allows for the eventual regeneration of the protein of interest
from the fusion protein by limited proteolysis and a second
chromatography step. The second chromatography step, in which
the product of the proteolysis is loaded onto an immobilized metal
ion affinity column, results in the separation of the protein of
interest from the affinity peptide.
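
The layout of such a construct, and the products expected after
proteolysis, can be sketched at the amino acid level. The example
below is illustrative only. It assumes an N-terminal tag and uses
the enterokinase recognition sequence DDDDK, which is cleaved after
the lysine; that sequence is a well-known fact about the enzyme, not
something stated in this specification. The tag is the
histidine-rich stretch of SEQ ID No:6, and the protein of interest
is a short hypothetical placeholder.

```python
EK_SITE = "DDDDK"  # enterokinase recognition sequence (assumed site)


def build_fusion(affinity_peptide, protein, site=EK_SITE):
    """N-terminal tag layout: affinity peptide, cleavage site, protein."""
    return affinity_peptide + site + protein


def cleave(fusion, site=EK_SITE):
    """Cut once after the first occurrence of the site.

    Returns (tag_with_site, released_protein).
    """
    index = fusion.find(site)
    if index < 0:
        raise ValueError("no cleavage site found")
    cut = index + len(site)
    return fusion[:cut], fusion[cut:]


TAG = "HLIHNVHKEEHAHAHNK"   # histidine-rich stretch from SEQ ID No:6
PROTEIN = "MSKGEELFTG"      # hypothetical placeholder sequence

fusion = build_fusion(TAG, PROTEIN)
tag_part, released = cleave(fusion)
print(tag_part, released)
```

In the second chromatography step, only the tag-containing part
would be expected to bind the metal ion column, while the released
protein flows through.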

An additional embodiment of the present invention
provides a DNA sequence coding for a polypeptide "secretion
signal" introduced into the DNA that codes for the fusion protein.
This secretion signal, when expressed, causes the fusion protein to
be secreted into the culture media after the fusion protein is
synthesized in the cell. Since a large number of cellular proteins
are not transported out of the cell, isolation and purification of the
fusion protein is enhanced as the requirements for cell disruption,
extraction and removal of unwanted cell components are
eliminated.

The present invention is directed to a fusion protein of
general formula R1-(HXn)m-R2 comprising: a protein of interest R1
or R2 fused at its amino terminus or carboxy terminus to at least
one affinity peptide, said affinity peptide having a formula (HXn)m,
wherein H is histidine, X is an amino acid other than histidine,
n=1-8, m=2-30, and wherein if n=1 for more than two adjacent
units of HX, at least one X must be asparagine, phenylalanine,
tryptophan, tyrosine, lysine, methionine, arginine, glutamine, or
cysteine. Preferably, n=1-4 and m=3-10. In one preferred
embodiment, n=1-4 and m=3-6. Preferably, if n=1 for more than
two adjacent units of HX, only one X is asparagine, phenylalanine,
tryptophan, tyrosine, lysine, methionine, arginine, glutamine, or
cysteine. In one preferred embodiment, the fusion protein has a
sequence SEQ ID No. 1. The fusion protein may contain at least
one protease cleavage site between said protein of interest and
said affinity peptide. Preferably, the affinity peptide has affinity
for metal ions. A representative metal ion is a nickel ion. The
fusion protein may further comprise a secretion signal sequence.

The present invention is also directed to a DNA
sequence coding for a fusion protein of general formula
R1-(HXn)m-R2 comprising: a protein of interest R1 or R2 fused at its
amino terminus or carboxy terminus to at least one affinity
peptide, said affinity peptide having a formula (HXn)m, wherein H
is histidine, X is an amino acid other than histidine, n=1-8,
m=2-30, and wherein if n=1 for more than two adjacent units of HX, at
least one X must be asparagine, phenylalanine, tryptophan,
tyrosine, lysine, methionine, arginine, glutamine, or cysteine.
Preferably, the fusion protein has the sequence shown in SEQ ID
No:6.

The present invention is also directed to a
recombinant vector comprising a DNA sequence disclosed herein,
wherein said recombinant vector is capable of directing
expression of said DNA sequence for said fusion protein in a
suitable host organism. The present invention is also directed to a
host organism containing a recombinant vector disclosed herein,
wherein said organism is capable of expressing said fusion
protein.

The present invention is also directed to a method for
purifying the fusion protein disclosed herein, comprising the steps
of: contacting a protein sample containing said fusion protein in a
mixture with other proteins with a metal chelate resin under
conditions where said fusion protein binds to said resin to produce
a resin-fusion protein complex; washing said resin-fusion protein
complex with a buffer to remove said other, unbound proteins;
and eluting said bound fusion protein from the washed
resin-fusion protein complex; wherein said eluted fusion protein is
purified. This method may further comprise the step of cleaving
said protein of interest from said affinity peptide. Moreover, this
method further comprises the step of separating said cleaved
protein of interest from said affinity peptide using a metal
chelate resin under conditions where said affinity peptide binds to
said metal of said resin and said protein of interest does not.

Any patents or publications mentioned in this 
specification are indicative of the levels of those skilled in the art 
to which the invention pertains. These patents and publications 
are herein incorporated by reference, to the same extent as if each 
individual publication was specifically and individually indicated 
to be incorporated by reference.

One skilled in the art will readily appreciate that the 
present invention is well adapted to carry out the objects and 
obtain the ends and advantages mentioned, as well as those 
inherent therein. The present examples along with the methods,
procedures, treatments, molecules, and specific compounds
described herein are presently representative of preferred
embodiments, are exemplary, and are not intended as limitations
on the scope of the invention. Changes therein and other uses will
occur to those skilled in the art which are encompassed within the
spirit of the invention as defined by the scope of the claims.

WHAT IS CLAIMED IS:

1. A fusion protein comprising: a protein of interest
fused at its amino terminus or carboxy terminus to at least one
affinity peptide, said fusion protein having a formula R1-(HXn)m-R2,
wherein R1 or R2 is said protein of interest, H is histidine, X is
an amino acid other than histidine, n=1-8, m=2-30, and wherein
if n=1 for more than two adjacent units of HX, at least one X must
be asparagine, phenylalanine, tryptophan, tyrosine, lysine,
methionine, arginine, glutamine, or cysteine.




2. The fusion protein of claim 1, wherein n=1-4.

3. The fusion protein of claim 1, wherein m=3-10.

4. The fusion protein of claim 1, wherein n=1-4 and
m=3-6.



5. The fusion protein of claim 1, wherein if n=1 for
more than two adjacent units of HX, only one X is asparagine,
phenylalanine, tryptophan, tyrosine, lysine, methionine, arginine,
glutamine, or cysteine.

6. The fusion protein of claim 2, wherein said
fusion protein has a sequence SEQ ID No. 6.

7. The fusion protein of claim 1, wherein said
fusion protein contains at least one protease cleavage site between
said protein of interest and said affinity peptide.

8. The fusion protein of claim 1, wherein said
affinity peptide has affinity for metal ions.

9. The fusion protein of claim 8, wherein said metal
ions are nickel ions.



10. The fusion protein of claim 1, wherein said 
fusion protein further comprises a secretion signal sequence. 

11. A DNA sequence coding for a fusion protein
comprising a protein of interest fused at its amino- or carboxy-
terminus to at least one affinity peptide, said fusion protein
having the formula R1-(HXn)m-R2, wherein R1 or R2 is said
protein of interest, n=1-8, m=2-30, and wherein if n=1 for more
than two adjacent units of HX, at least one X must be asparagine,
phenylalanine, tryptophan, tyrosine, lysine, methionine, arginine,
glutamine, or cysteine.

12. The DNA sequence of claim 11, wherein said
fusion protein has the sequence shown in SEQ ID No:1.

13. A recombinant vector comprising a DNA
sequence of claim 11, wherein said recombinant vector is capable
of directing expression of said DNA sequence for said fusion
protein in a suitable host organism.

14. A host organism containing a recombinant vector
of claim 13, wherein said organism is capable of expressing said
fusion protein.

15. A method for purifying the fusion protein of
claim 1, comprising the steps of:
contacting a protein sample containing said fusion
protein in a mixture with other proteins with a metal chelate resin
under conditions where said fusion protein binds to said resin to
produce a resin-fusion protein complex;
washing said resin-fusion protein complex with a
buffer to remove said other, unbound proteins; and
eluting said bound fusion protein from the washed
resin-fusion protein complex; wherein said eluted fusion protein is
purified.

16. The method of claim 15, further comprising the
step of cleaving said protein of interest from said affinity peptide.

17. The method of claim 16, further comprising the
step of separating said cleaved protein of interest from said
affinity peptide using a metal chelate resin under conditions
where said affinity peptide binds to said metal of said resin and
said protein of interest does not.

SEQUENCE LISTING

<110> Tchaga, Grigoriy 

Jokhadze, George G. 

<120> Compositions and Methods for Protein Purification 

Based on a Novel Metal Ion Affinity Site 

<130> D6094PCT 

<140> 

<141> 1999-05-14 

<150> US 09/078,687 

<151> 1998-05-14 

<160> 6 

<210> 1 
<211> 840 
<212> DNA 

<213> artificial sequence 

<220> 

<223> Partial cDNA sequence of a vector containing the 

affinity peptide at the N-terminus of Green 
Fluorescent Protein-UV mutant (GFPuv)

<400> 1

atgaccatga ttacgccaag cttgtctctc aaggatcatc tcatccacaa 50
tgtccacaaa gaggagcacg ctcatgccca caacaagatc agcgtggttg 100
gtgtgggtgc agttggaccg gtaagtaaag gagaagaact tttcactgga 150
gttgtcccaa ttcttgttga attagatggt gatgttaatg ggcacaaatt 200
ttctgtcagt ggagagggtg aaggtgatgc aacatacgga aaacttaccc 250
ttaaatttat ttgcactact ggaaaactac ctgttccatg gccaacactt 300
gtcactactt tctcttatgg tgttcaatgc ttttcccgtt atccggatca 350
tatgaaacgg catgactttt tcaagagtgc catgcccgaa ggttatgtac 400
aggaacgcac tatatctttc aaagatgacg ggaactacaa gacgcgtgct 450
gaagtcaagt ttgaaggtga tacccttgtt aatcgtatcg agttaaaagg 500
tattgatttt aaagaagatg gaaacattct cggacacaaa ctcgagtaca 550
actataactc acacaatgta tacatcacgg cagacaaaca aaagaatgga 600
atcaaagcta acttcaaaat tcgccacaac attgaagatg gatccgttca 650
actagcagac cattatcaac aaaatactcc aattggcgat ggccctgtcc 700
ttttaccaga caaccattac ctgtcgacac aatctgccct ttcgaaagat 750
cccaacgaaa agcgtgacca catggtcctt cttgagtttg taactgctgc 800
tgggattaca catggcatgg atgagctcta caaataatga 840

<210> 2
<211> 278
<212> PRT
<213> artificial sequence
<220>
<223> Amino acid sequence encoded by partial cDNA sequence
of a vector containing the affinity peptide at the
N-terminus of Green Fluorescent Protein-UV mutant (GFPuv)

<400> 2

Met Thr Met Ile Thr Pro Ser Leu Ser Leu Lys Asp His Leu Ile
5 10 15

His Asn Val His Lys Glu Glu His Ala His Ala His Asn Lys Ile
20 25 30

Ser Val Val Gly Val Gly Ala Val Gly Pro Val Ser Lys Gly Glu
35 40 45

Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp Gly
50 55 60

Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu Gly Glu Gly
65 70 75

Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr
80 85 90

Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe Ser
95 100 105

Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Arg
110 115 120

His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu
125 130 135

Arg Thr Ile Ser Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala
140 145 150

Glu Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu
155 160 165

Lys Gly Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys
170 175 180

Leu Glu Tyr Asn Tyr Asn Ser His Asn Val Tyr Ile Thr Ala Asp
185 190 195

Lys Gln Lys Asn Gly Ile Lys Ala Asn Phe Lys Ile Arg His Asn
200 205 210

Ile Glu Asp Gly Ser Val Gln Leu Ala Asp His Tyr Gln Gln Asn
215 220 225

Thr Pro Ile Gly Asp Gly Pro Val Leu Leu Pro Asp Asn His Tyr
230 235 240

Leu Ser Thr Gln Ser Ala Leu Ser Lys Asp Pro Asn Glu Lys Arg
245 250 255

Asp His Met Val Leu Leu Glu Phe Val Thr Ala Ala Gly Ile Thr
260 265 270

His Gly Met Asp Glu Leu Tyr Lys
275

<210> 3 

<211> 3384 

<212> DNA 

<213> artificial sequence 
<220> 

<223> cDNA sequence of a vector containing the construct 
of the affinity peptide at the N-terminus of GFPuv

<400> 3

agcgcccaat acgcaaaccg cctctccccg cgcgttggcc gattcattaa 50
tgcagctggc acgacaggtt tcccgactgg aaagcgggca gtgagcgcaa 100
cgcaattaat gtgagttagc tcactcatta ggcaccccag gctttacact 150
ttatgcttcc ggctcgtatg ttgtgtggaa ttgtgagcgg ataacaattt 200
cacacaggaa acagctatga ccatgattac gccaagcttg tctctcaagg 250
atcatctcat ccacaatgtc cacaaagagg agcacgctca tgcccacaac 300
aagatcagcg tggttggtgt gggtgcagtt ggaccggtaa gtaaaggaga 350
agaacttttc actggagttg tcccaattct tgttgaatta gatggtgatg 400
ttaatgggca caaattttct gtcagtggag agggtgaagg tgatgcaaca 450
tacggaaaac ttacccttaa atttatttgc actactggaa aactacctgt 500
tccatggcca acacttgtca ctactttctc ttatggtgtt caatgctttt 550
cccgttatcc ggatcatatg aaacggcatg actttttcaa gagtgccatg 600
cccgaaggtt atgtacagga acgcactata tctttcaaag atgacgggaa 650
ctacaagacg cgtgctgaag tcaagtttga aggtgatacc cttgttaatc 700
gtatcgagtt aaaaggtatt gattttaaag aagatggaaa cattctcgga 750
cacaaactcg agtacaacta taactcacac aatgtataca tcacggcaga 800
caaacaaaag aatggaatca aagctaactt caaaattcgc cacaacattg 850
aagatggatc cgttcaacta gcagaccatt atcaacaaaa tactccaatt 900
ggcgatggcc ctgtcctttt accagacaac cattacctgt cgacacaatc 950
tgccctttcg aaagatccca acgaaaagcg tgaccacatg gtccttcttg 1000
agtttgtaac tgctgctggg attacacatg gcatggatga gctctacaaa 1050
taatgaattc caactgagcg ccggtcgcta ccattaccaa cttgtctggt 1100
gtcaaaaata ataggcctac tagtcggccg tacgggccct ttcgtctcgc 1150
gcgtttcggt gatgacggtg aaaacctctg acacatgcag ctcccggaga 1200
cggtcacagc ttgtctgtaa gcggatgccg ggagcagaca agcccgtcag 1250
ggcgcgtcag cgggtgttgg cgggtgtcgg ggctggctta actatgcggc 1300
atcagagcag attgtactga gagtgcacca tatgcggtgt gaaataccgc 1350
acagatgcgt aaggagaaaa taccgcatca ggcggcctta agggcctcgt 1400
gatacgccta tttttatagg ttaatgtcat gataataatg gtttcttaga 1450
cgtcaggtgg cacttttcgg ggaaatgtgc gcggaacccc tatttgttta 1500
tttttctaaa tacattcaaa tatgtatccg ctcatgagac aataaccctg 1550
ataaatgctt caataatatt gaaaaaggaa gagtatgagt attcaacatt 1600
tccgtgtcgc ccttattccc ttttttgcgg cattttgcct tcctgttttt 1650
gctcacccag aaacgctggt gaaagtaaaa gatgctgaag atcagttggg 1700
tgcacgagtg ggttacatcg aactggatct caacagcggt aagatccttg 1750
agagttttcg ccccgaagaa cgttttccaa tgatgagcac ttttaaagtt 1800
ctgctatgtg gcgcggtatt atcccgtatt gacgccgggc aagagcaact 1850
cggtcgccgc atacactatt ctcagaatga cttggttgag tactcaccag 1900
tcacagaaaa gcatcttacg gatggcatga cagtaagaga attatgcagt 1950
gctgccataa ccatgagtga taacactgcg gccaacttac ttctgacaac 2000
gatcggagga ccgaaggagc taaccgcttt tttgcacaac atgggggatc 2050
atgtaactcg ccttgatcgt tgggaaccgg agctgaatga agccatacca 2100
aacgacgagc gtgacaccac gatgcctgta gcaatggcaa caacgttgcg 2150
caaactatta actggcgaac tacttactct agcttcccgg caacaattaa 2200
tagactggat ggaggcggat aaagttgcag gaccacttct gcgctcggcc 2250
cttccggctg gctggtttat tgctgataaa tctggagccg gtgagcgtgg 2300
gtctcgcggt atcattgcag cactggggcc agatggtaag ccctcccgta 2350
tcgtagttat ctacacgacg gggagtcagg caactatgga tgaacgaaat 2400
agacagatcg ctgagatagg tgcctcactg attaagcatt ggtaactgtc 2450
agaccaagtt tactcatata tactttagat tgatttaaaa cttcattttt 2500
aatttaaaag gatctaggtg aagatccttt ttgataatct catgaccaaa 2550
atcccttaac gtgagttttc gttccactga gcgtcagacc ccgtagaaaa 2600
gatcaaagga tcttcttgag atcctttttt tctgcgcgta atctgctgct 2650
tgcaaacaaa aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa 2700
gagctaccaa ctctttttcc gaaggtaact ggcttcagca gagcgcagat 2750
accaaatact gtccttctag tgtagccgta gttaggccac cacttcaaga 2800
actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg 2850
gctgctgcca gtggcgataa gtcgtgtctt accgggttgg actcaagacg 2900
atagttaccg gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca 2950 
cacagcccag cttggagcga acgacctaca ccgaactgag atacctacag 3000 
cgtgagctat gagaaagcgc cacgcttccc gaagggagaa aggcggacag 3050 
gtatccggta agcggcaggg tcggaacagg agagcgcacg agggagcttc 3100 
cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc 3150 
tgacttgagc gtcgattttt gtgatgctcg tcaggggggc ggagcctatg 3200 
gaaaaacgcc agcaacgcgg cctttttacg gttcctggcc ttttgctggc 3250 
cttttgctca catgttcttt cctgcgttat cccctgattc tgtggataac 3300 
cgtattaccg cctttgagtg agctgatacc gctcgccgca gccgaacgac 3350 
cgagcgcagc gagtcagtga gcgaggaagc ggaa 3384 



<210> 4 
<211> 2754 
<212> DNA 

<213> artificial sequence 

<220> 

<223> cDNA sequence of a vector containing part of the 

affinity peptide at the N-terminus of enterokinase 
cleavage site

<400> 4

gacgaaaggg cctcgtgata cgcctatttt tataggttaa tgtcatgata 50
ataatggttt cttagacgtc aggtggcact tttcggggaa atgtgcgcgg 100
aacccctatt tgtttatttt tctaaataca ttcaaatatg tatccgctca 150
tgagacaata accctgataa atgcttcaat aatattgaaa aaggaagagt 200
atgagtattc aacatttccg tgtcgccctt attccctttt ttgcggcatt 250
ttgccttcct gtttttgctc acccagaaac gctggtgaaa gtaaaagatg 300
ctgaagatca gttgggtgca cgagtgggtt acatcgaact ggatctcaac 350
agcggtaaga tccttgagag ttttcgcccc gaagaacgtt ttccaatgat 400
gagcactttt aaagttctgc tatgtggcgc ggtattatcc cgtattgacg 450
ccgggcaaga gcaactcggt cgccgcatac actattctca gaatgacttg 500
gttgagtact caccagtcac agaaaagcat cttacggatg gcatgacagt 550
aagagaatta tgcagtgctg ccataaccat gagtgataac actgcggcca 600
acttacttct gacaacgatc ggaggaccga aggagctaac cgcttttttg 650
cacaacatgg gggatcatgt aactcgcctt gatcgttggg aaccggagct 700
gaatgaagcc ataccaaacg acgagcgtga caccacgatg cctgtagcaa 750
tggcaacaac gttgcgcaaa ctattaactg gcgaactact tactctagct 800
tcccggcaac aattaataga ctggatggag gcggataaag ttgcaggacc 850
acttctgcgc tcggcccttc cggctggctg gtttattgct gataaatctg 900 
gagccggtga gcgtgggtct cgcggtatca ttgcagcact ggggccagat 950 
ggtaagccct cccgtatcgt agttatctac acgacgggga gtcaggcaac 1000 
tatggatgaa cgaaatagac agatcgctga gataggtgcc tcactgatta 1050 
agcattggta actgtcagac caagtttact catatatact ttagattgat 1100 
ttaaaacttc atttttaatt taaaaggatc taggtgaaga tcctttttga 1150 
taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt 1200 
cagaccccgt agaaaagatc aaaggatctt cttgagatcc tttttttctg 1250 
cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac cagcggtggt 1300 
ttgtttgccg gatcaagagc taccaactct ttttccgaag gtaactggct 1350 
tcagcagagc gcagatacca aatactgtcc ttctagtgta gccgtagtta 1400 
ggccaccact tcaagaactc tgtagcaccg cctacatacc tcgctctgct 1450 
aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg 1500 
ggttggactc aagacgatag ttaccggata aggcgcagcg gtcgggctga 1550 
acggggggtt cgtgcacaca gcccagcttg gagcgaacga cctacaccga 1600 
actgagatac ctacagcgtg agctatgaga aagcgccacg cttcccgaag 1650 
ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg aacaggagag 1700 
cgcacgaggg agcttccagg gggaaacgcc tggtatcttt atagtcctgt 1750 
cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag 1800 
gggggcggag cctatggaaa aacgccagca acgcggcctt tttacggttc 1850 
ctggcctttt gctggccttt tgctcacatg ttctttcctg cgttatcccc 1900 
tgattctgtg gataaccgta ttaccgcctt tgagtgagct gataccgctc 1950 
gccgcagccg aacgaccgag cgcagcgagt cagtgagcga ggaagcggaa 2000 
gagcgcccaa tacgcaaacc gcctctcccc gcgcgttggc cgattcatta 2050 
atgcagctgg cacgacaggt ttcccgactg gaaagcgggc agtgagcgca 2100 
acgcaattaa tgtgagttag ctcactcatt aggcacccca ggctttacac 2150 
tttatgcttc cggctcgtat gttgtgtgga attgtgagcg gataacaatt 2200 
tcacacagga aacagctatg accatgatta cgccaagctt gaaggatcat 2250 
ctcatccaca atgtccacaa agaggagcac gctcatgccc acaacaagat 2300 
cgatgacgat gacaaagtcg acggatcccc gggtaccgag ctcgtaatta 2350 
gctgagaatt cactggccgt cgttttacaa cgtcgtgact gggaaaaccc 2400 
tggcgttacc caacttaatc gccttgcagc acatccccct ttcgccagct 2450 
ggcgtaatag cgaagaggcc cgcaccgatc gcccttccca acagttgcgc 2500 
agcctgaatg gcgaatggcg cctgatgcgg tattttctcc ttacgcatct 2550 
gtgcggtatt tcacaccgca tatggtgcac tctcagtaca atctgctctg 2600 
atgccgcata gttaagccag ccccgacacc cgccaacacc cgctgacgcg 2650 
ccctgacggg cttgtctgct cccggcatcc gcttacagac aagctgtgac 2700
cgtctccggg agctgcatgt gtcagaggtt ttcaccgtca tcaccgaaac 2750
gcgc 2754

<210> 5
<211> 45
<212> PRT
<213> artificial sequence
<220>
<223> Amino acid sequence
to the start of translation site, the affinity
peptide and the multiple cloning site

<400> 5

Met Thr Met Ile Thr Pro Ser Leu Lys Asp His Leu Ile His Asn
5 10 15

Val His Lys Glu Glu His Ala His Ala His Asn Lys Ile Asp Asp
20 25 30

Asp Asp Lys Val Asp Gly Ser Pro Gly Thr Glu Leu Val Ile Ser
35 40 45



<210> 6
<211> 32
<212> PRT
<213> artificial sequence
<220>
<223> Amino acid sequence for the affinity peptide

<400> 6



Ser [illegible] Lys Asp His Leu Ile His Asn Val His Lys Glu Glu His
5 10 15

Ala His Ala His Asn Lys Ile Ser Val Val Gly Val Gly Ala Val
20 25 30

Gly Met